feat: cron job to parse nippon mutual funds #24

Open
wants to merge 13 commits into
base: master

Conversation

Sama-004

@Sama-004 Sama-004 commented Oct 12, 2024

resolves #23

@Sama-004
Author

Sama-004 commented Oct 12, 2024

It can now parse all the links inside the hrefs for the month-end monthly portfolios. You can see the output here.

Just need to download the files, parse them, and upload to Cloudinary. I can complete this tomorrow (haven't set up Cloudinary and Mongo yet). @cu8code do you want to collab?

@cu8code

cu8code commented Oct 12, 2024

Sure, looks good! But I will probably need push access to your repo. I will try to download and parse the files, let's see!

@shivamsouravjha
Owner

  • The parsing is good, no doubt. We will have to find a platform that can run this cron job. Alternatively, since the server keeps running anyway, we can check whether it is the first day of the month and, if so, fetch and parse (or scrape) the new data.

@cu8code

cu8code commented Oct 12, 2024

If we use an external platform, we cannot run Golang directly on it. But if we expose our service as a URL, the external platform can call that URL to run the script. We can use make.com or n8n to call our URL!

@shivamsouravjha any alternative solution?
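A rough sketch of the "expose the job as a URL" idea, using only the standard library. The `X-Cron-Token` header, the `/cron/nippon` path, and the names `triggerHandler` and `invoke` are assumptions made for illustration, not anything that exists in this PR. The handler accepts an authorised POST, starts the job in the background, and returns 202 so the external scheduler does not wait for the whole scrape.

```go
package main

import (
	"crypto/subtle"
	"net/http"
	"net/http/httptest"
)

// triggerHandler returns a handler that starts job when it receives a POST
// carrying the shared secret in the (hypothetical) X-Cron-Token header.
func triggerHandler(secret string, job func()) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		token := r.Header.Get("X-Cron-Token")
		// An empty secret is rejected so a missing config cannot open the endpoint.
		if secret == "" || subtle.ConstantTimeCompare([]byte(token), []byte(secret)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		go job() // run in the background so the caller gets a quick response
		w.WriteHeader(http.StatusAccepted)
	}
}

// invoke exercises a handler in-process and returns the HTTP status code.
// It exists only to make the sketch easy to try without starting a server.
func invoke(h http.Handler, method, token string) int {
	req := httptest.NewRequest(method, "/cron/nippon", nil)
	req.Header.Set("X-Cron-Token", token)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}
```

In a real server this would be registered with something like `http.Handle("/cron/nippon", triggerHandler(secret, performUploadTask))`, and make.com or n8n would be configured to POST to that URL once a month.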

@shivamsouravjha
Owner

I think we can do this in-house as well in that case, because we are using a ticker anyway (to keep the server alive, as it would die after 50 seconds of inactivity).
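The in-house option could look roughly like this: the ticker that already keeps the server alive also checks whether it is the first day of the month and runs the job once. `monthlyGate`, `startMonthlyTicker`, and `date` are hypothetical names for this sketch, and the "already ran this month" state is kept in memory only, which is an assumption (a restart on the 1st would run the job again).

```go
package main

import "time"

// monthlyGate remembers the last month the job ran, so repeated ticks on
// the first day of a month trigger the job only once.
type monthlyGate struct {
	lastRun string // "2006-01" key of the last month the job ran
}

// due reports whether the job should run at now and, if so, records the run.
func (g *monthlyGate) due(now time.Time) bool {
	key := now.Format("2006-01")
	if now.Day() != 1 || key == g.lastRun {
		return false
	}
	g.lastRun = key
	return true
}

// date is a small helper for building UTC dates when trying the gate out.
func date(year int, month time.Month, day int) time.Time {
	return time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
}

// startMonthlyTicker checks the gate on every tick and runs job when due.
// It blocks until stop is closed.
func startMonthlyTicker(interval time.Duration, job func(), stop <-chan struct{}) {
	gate := &monthlyGate{}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			if gate.due(now) {
				job()
			}
		case <-stop:
			return
		}
	}
}
```

Something like `go startMonthlyTicker(30*time.Second, performUploadTask, stop)` would hook it into the server; the interval here is only an example value.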

@Sama-004
Author

@cu8code let me know how I can help. If you're on Discord, add me: sama004

@cu8code cu8code force-pushed the feat/cron-parse-mfs branch from 41db6de to 23a2222 Compare October 13, 2024 04:29
@Sama-004
Author

@shivamsouravjha can you give some more info on what to do now with Mongo?

@shivamsouravjha
Owner

Since you store the Excel sheet and it is for a particular month, we can store the file and its Cloudinary link in Mongo.

The name can be the name of the MF + month (the complete name, so that it's not duplicated and we can find it with a simple regex).

Fields can be:

  • unique name (MF complete name + month)
  • month
  • complete name
  • Cloudinary link
  • uuid
  • fund house (here, nippon)

(Here we can use a primary + secondary key technique on complete name and month, so as not to need the additional unique name field.)
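To make the field list concrete, here is a minimal sketch of what such a document could look like in Go. The struct name, the bson tag names, and the helpers `uniqueKey` and `dedupe` are all assumptions for illustration, not the PR's actual schema. `dedupe` mimics in memory what a unique compound index on (complete name, month) would enforce in Mongo.

```go
package main

import "strings"

// portfolioFile mirrors the suggested fields. Tag and field names are
// assumptions; the bson tags are plain struct tags and need no import.
type portfolioFile struct {
	ID            string `bson:"_id"` // custom uuid
	CompleteName  string `bson:"complete_name"`
	Month         string `bson:"month"`
	CloudinaryURL string `bson:"cloudinary_link"`
	FundHouse     string `bson:"fund_house"`
}

// uniqueKey builds a normalised "<complete name>|<month>" key: lower-cased,
// with runs of whitespace collapsed to a single space.
func uniqueKey(completeName, month string) string {
	norm := func(s string) string {
		return strings.ToLower(strings.Join(strings.Fields(s), " "))
	}
	return norm(completeName) + "|" + norm(month)
}

// dedupe keeps the first file seen for each (complete name, month) pair.
func dedupe(files []portfolioFile) []portfolioFile {
	seen := make(map[string]bool)
	var out []portfolioFile
	for _, f := range files {
		key := uniqueKey(f.CompleteName, f.Month)
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, f)
	}
	return out
}
```

With a compound unique index on the two fields, the separate "unique name" field becomes unnecessary, which is the point made in the last paragraph above.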

@Sama-004
Author

Sama-004 commented Oct 15, 2024

Since you store the Excel sheet and it is for a particular month, we can store the file and its Cloudinary link in Mongo.

The name can be the name of the MF + month (the complete name, so that it's not duplicated and we can find it with a simple regex).

Fields can be:

  • unique name (MF complete name + month)
  • month
  • complete name
  • Cloudinary link
  • uuid
  • fund house (here, nippon)

(Here we can use a primary + secondary key technique on complete name and month, so as not to need the additional unique name field.)

Facing some issues with Cloudinary. Will fix those and implement Mongo soon. Can you check whether the Cloudinary implementation is okay?

@Sama-004
Author

Sama-004 commented Oct 21, 2024

@shivamsouravjha

image

We are parsing the date from the filename using a regex, but it does not work for some entries where the filename has an unusual format:

IN_MF_RLMF_MONTHLY_PORTFOLIO_REPORT-%25282%2529.xls
RVSD-IN_MF_RLMF_MONTHLY_PORTFOLIO_REPORT-revised.xls
RelianceMonthlyPortfolios31122013.xls
RelianceMonthlyPortfolios31102013.xls
RelianceMonthlyPortfolios31032014.xls
Reliance-Monthly-Portfolios-30062015.xls
Reliance-Monthly-Portfolios-31052015.xls
Reliance-Monthly-Portfolios-30042015.xls
Reliance%2520Monthly%2520Portfolios-31.03.2015-1.xlsx
Reliance-Monthly-Portfolios-28022015.xls
Reliance-Monthly-Portfolios-31012015.xls
Reliance-Monthly-Portfolios-31122014.xls
Reliance-Monthly-Portfolios-30092015.xls
Reliance-Monthly-Portfolios-31082015.xls
Reliance-Monthly-Portfolios-31072015.xls
Reliance%2520Monthly%2520Portfolios-30.11.2014.xls
Reliance%2520Monthly%2520Portfolios-31.10.2014.xls
RTFOLIO-MAR-23.xls
MONTHLY-PORTFOLIO-FEB-23.xls
NIMF-Portfolio-with-Rikometer-March-21.xlsx
Portfolio-June-With-riskometer.xls

These are the files where the regex doesn't match, so the month field stays empty for them.

Also, can you clarify the fund house field? Will it be nippon for every entry?

Let me know what I have to change.

edit: also, do you want the _id to be the custom uuid, or is it fine as it is right now?
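For the filenames listed above, one possible direction is a pair of fallback patterns: one for `ddmmyyyy` / `dd.mm.yyyy` dates and one for a month name followed by a year. This is only a sketch; `monthFromFilename` is a hypothetical name, the `YYYY-MM` output format is an assumption, and two-digit years are assumed to mean 20xx. Names with no usable date (for example `Portfolio-June-With-riskometer.xls`, which has no year) still come back empty and would need a date added manually.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var (
	// 31122013, 31.03.2015 or 31-03-2015: day, month, four-digit year.
	numericDate = regexp.MustCompile(`(\d{2})[.-]?(\d{2})[.-]?(\d{4})`)
	// MAR-23, March-21, Feb_2023: month name, separator, 2- or 4-digit year.
	namedDate = regexp.MustCompile(`(?i)\b(jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)[-_ ](\d{4}|\d{2})\b`)

	monthNumbers = map[string]int{
		"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
		"jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
	}
)

// monthFromFilename returns the report month as "YYYY-MM" and true when one
// of the patterns matches, or "" and false when no date can be recovered.
func monthFromFilename(name string) (string, bool) {
	if m := numericDate.FindStringSubmatch(name); m != nil {
		day, _ := strconv.Atoi(m[1])
		month, _ := strconv.Atoi(m[2])
		year, _ := strconv.Atoi(m[3])
		// Reject digit runs that cannot be a calendar date.
		if day >= 1 && day <= 31 && month >= 1 && month <= 12 {
			return fmt.Sprintf("%04d-%02d", year, month), true
		}
	}
	if m := namedDate.FindStringSubmatch(name); m != nil {
		month := monthNumbers[strings.ToLower(m[1])[:3]]
		year, _ := strconv.Atoi(m[2])
		if len(m[2]) == 2 {
			year += 2000 // assumption: two-digit years are 20xx
		}
		return fmt.Sprintf("%04d-%02d", year, month), true
	}
	return "", false
}
```

The patterns work on the raw, still URL-encoded names, so `%2520` sequences do not need to be decoded first.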

@shivamsouravjha
Owner

  1. We will be targeting each fund individually; for now, nippon is the first priority.
  2. Since we're only targeting nippon with this one, we won't face the month problem.
  3. A custom uuid is better any day.
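If the custom uuid is generated in the job itself, a random (version 4) UUID can be built with the standard library alone. This is a sketch under that assumption; `newUUID` is a hypothetical helper, and the PR may well use a UUID package instead.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID returns a random version 4 UUID in the usual 8-4-4-4-12 hex form.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}
```

The returned string could then be stored as the document's `_id` instead of Mongo's default ObjectID.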

@Sama-004
Author

Sama-004 commented Oct 28, 2024

image
@shivamsouravjha done

  1. package is still main
  2. to test it without cron:
func main() {
	performUploadTask()
}
  3. add DATABASE_NAME and COLLECTION_NAME in the env
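For the two env variables, a small loader that fails fast when either one is missing can save some debugging. This is a sketch, not the code in the PR: `mongoConfig`, `loadMongoConfig`, and `loadMongoConfigFromEnv` are hypothetical names, and only `DATABASE_NAME` and `COLLECTION_NAME` come from the comment above.

```go
package main

import (
	"fmt"
	"os"
)

// mongoConfig holds the two values the upload task needs from the env.
type mongoConfig struct {
	Database   string
	Collection string
}

// loadMongoConfig reads both variables through lookup and returns an error
// naming the first one that is unset or empty.
func loadMongoConfig(lookup func(string) (string, bool)) (mongoConfig, error) {
	var cfg mongoConfig
	required := []struct {
		key string
		dst *string
	}{
		{"DATABASE_NAME", &cfg.Database},
		{"COLLECTION_NAME", &cfg.Collection},
	}
	for _, r := range required {
		val, ok := lookup(r.key)
		if !ok || val == "" {
			return mongoConfig{}, fmt.Errorf("missing required env var %s", r.key)
		}
		*r.dst = val
	}
	return cfg, nil
}

// loadMongoConfigFromEnv reads the values from the process environment.
func loadMongoConfigFromEnv() (mongoConfig, error) {
	return loadMongoConfig(os.LookupEnv)
}
```

Taking the lookup function as a parameter keeps the loader easy to exercise without touching the real environment.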

@shivamsouravjha
Owner

Sure @Sama-004, testing it now! Sorry for the delay, the past few weeks were a bit too busy!

@shivamsouravjha
Owner

The build is breaking. Perhaps you should update the package name, and also link this function to the main files.

Sama-004 and others added 8 commits November 5, 2024 00:25
- Doesn't work for filenames with weird date format like

IN_MF_RLMF_MONTHLY_PORTFOLIO_REPORT-%25282%2529.xls
RVSD-IN_MF_RLMF_MONTHLY_PORTFOLIO_REPORT-revised.xls
RelianceMonthlyPortfolios31122013.xls
RelianceMonthlyPortfolios31102013.xls
RelianceMonthlyPortfolios31032014.xls
Reliance-Monthly-Portfolios-30062015.xls
Reliance-Monthly-Portfolios-31052015.xls
Reliance-Monthly-Portfolios-30042015.xls
Reliance%2520Monthly%2520Portfolios-31.03.2015-1.xlsx
Reliance-Monthly-Portfolios-28022015.xls
Reliance-Monthly-Portfolios-31012015.xls
Reliance-Monthly-Portfolios-31122014.xls
Reliance-Monthly-Portfolios-30092015.xls
Reliance-Monthly-Portfolios-31082015.xls
Reliance-Monthly-Portfolios-31072015.xls
Reliance%2520Monthly%2520Portfolios-30.11.2014.xls
Reliance%2520Monthly%2520Portfolios-31.10.2014.xls
RTFOLIO-MAR-23.xls
MONTHLY-PORTFOLIO-FEB-23.xls
NIMF-Portfolio-with-Rikometer-March-21.xlsx
Portfolio-June-With-riskometer.xls

need to add dates for these files manually
Signed-off-by: shivamsouravjha <[email protected]>
Signed-off-by: shivamsouravjha <[email protected]>
@shivamsouravjha
Owner

Also, since all of these files contain the details of the funds, it would be fun to dump the data. I will pick it up if you folks are busy: basically parse each sheet, break it up page by page, and store the data in a separate Mongo collection.
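A rough sketch of the "break it page by page" step, assuming the workbook has already been read into rows of strings by some Excel library (reading `.xls` is outside the standard library and not shown). The types `sheet` and `holdingRecord` and the function `recordsFromSheets` are hypothetical, and treating the first row of each sheet as the header is an assumption that real portfolio sheets may not satisfy.

```go
package main

import "strings"

// sheet is one page of a workbook, already read into rows of cell text.
type sheet struct {
	Name string
	Rows [][]string
}

// holdingRecord is one data row from one sheet, keyed by the header cells.
// Each record would become one document in the separate collection.
type holdingRecord struct {
	FundHouse string
	Month     string
	Sheet     string
	Fields    map[string]string
}

// recordsFromSheets flattens every sheet into records, skipping sheets
// without data rows and rows that are entirely empty.
func recordsFromSheets(fundHouse, month string, sheets []sheet) []holdingRecord {
	var out []holdingRecord
	for _, s := range sheets {
		if len(s.Rows) < 2 {
			continue // need a header row plus at least one data row
		}
		header := s.Rows[0]
		for _, row := range s.Rows[1:] {
			fields := make(map[string]string)
			for i, cell := range row {
				if i >= len(header) {
					break // ignore cells beyond the header width
				}
				key := strings.TrimSpace(header[i])
				val := strings.TrimSpace(cell)
				if key == "" || val == "" {
					continue
				}
				fields[key] = val
			}
			if len(fields) == 0 {
				continue
			}
			out = append(out, holdingRecord{
				FundHouse: fundHouse,
				Month:     month,
				Sheet:     s.Name,
				Fields:    fields,
			})
		}
	}
	return out
}
```

Keeping the sheet name, month, and fund house on every record means the new collection can be queried per fund without joining back to the file metadata.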

@Sama-004
Author

Sama-004 commented Nov 6, 2024

Also, since all of these files contain the details of the funds, it would be fun to dump the data. I will pick it up if you folks are busy: basically parse each sheet, break it up page by page, and store the data in a separate Mongo collection.

I can do this after 2 weeks, a bit busy rn. @cu8code what about you?

Signed-off-by: shivamsouravjha <[email protected]>
Signed-off-by: shivamsouravjha <[email protected]>
Signed-off-by: shivamsouravjha <[email protected]>